Big Data Now: 2012 Edition by O’Reilly Media, Inc.
Author: O’Reilly Media, Inc.
Language: eng
Format: mobi
Tags: COMPUTERS / Data Processing
ISBN: 9781449356675
Publisher: O’Reilly Media
Published: 2012-10-23T04:00:00+00:00
The first law is closely related to the “bike shed effect” (also known as Parkinson’s Law of Triviality), which states that “the time spent on any item of the agenda will be in inverse proportion to the sum involved.”
In other words, if you try to build a simple thing such as a public bike shed, there will be endless town hall discussions wherein people argue over trivial details such as the color of the door. But if you want to build a nuclear power plant — a project so vast and complicated that most people can’t understand it — people will defer to expert opinion.
Such is the case with statistics.
If you make the mistake of going into the comments section of any news piece discussing a scientific finding, invariably someone will leave the comment, “correlation does not equal causation.”
We’ll go ahead and call that truism Voytek’s fourth law.
But people rarely have the capacity to argue against the methods and models used by, say, neuroscientists or cosmologists.
But sometimes we get perfect models without any understanding of the underlying processes. What do we learn from that?
The always fantastic Radiolab did a follow-up story on the Schmidt and Lipson “automated science” research in an episode titled “Limits of Science.” It turns out that a biologist contacted Schmidt and Lipson and gave them data to run their algorithm on. The goal was to figure out the principles governing the dynamics of a single-celled bacterium. Their result?
Well, sometimes the stories we tell with data … they just don’t make sense to us.
They found “two equations that describe the data.”
But they didn’t know what the equations meant. They had no context. Their variables had no meaning. Or, as Radiolab co-host Jad Abumrad put it, “the more we turn to computers with these big questions, the more they’ll give us answers that we just don’t understand.”
So while big data projects are creating ridiculously exciting new vistas for scientific exploration and collaboration, we have to take care to avoid the Paradox of Information, wherein we can know too many things without knowing what those “things” are.
Because at some point, we’ll have so much data that we’ll stop being able to discern the map from the territory. Our goal as (data) scientists should be to distill the essence of the data into something that tells as true a story as possible while being as simple as possible to understand. Or, to operationalize that sentence better, we should aim to find a balance between minimizing the residuals of our models and maximizing our ability to make sense of those models.
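One way to make that balance concrete is to score each candidate model by its residuals plus a penalty for every parameter it needs. The sketch below is only an illustration, not a method from the research discussed above: the data points are made up, the parameter count is a crude stand-in for "ability to make sense" of a model, and the penalty weight is a hypothetical value chosen for the example (criteria such as AIC and BIC formalize the same idea differently).

```python
from statistics import mean

# Made-up observations that roughly follow y = 2x + 1 (illustrative only).
xs = [0, 1, 2, 3, 4, 5]
ys = [1.1, 2.9, 5.2, 6.8, 9.1, 10.9]


def fit_constant(xs, ys):
    """One parameter: always predict the mean of y."""
    m = mean(ys)
    return (lambda x: m), 1


def fit_line(xs, ys):
    """Two parameters: ordinary least-squares slope and intercept."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return (lambda x: slope * x + intercept), 2


def fit_lookup(xs, ys):
    """One parameter per data point: memorize every observation."""
    table = dict(zip(xs, ys))
    return (lambda x: table[x]), len(table)


def rss(predict, xs, ys):
    """Residual sum of squares of a model on the data."""
    return sum((y - predict(x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical weight: how much residual we will accept to save one parameter.
PENALTY_PER_PARAMETER = 1.0


def evaluate(fit, xs, ys):
    """Return (residuals, parameter count, penalized score) for one model."""
    predict, n_params = fit(xs, ys)
    residual = rss(predict, xs, ys)
    return residual, n_params, residual + PENALTY_PER_PARAMETER * n_params


candidates = [("constant", fit_constant), ("line", fit_line), ("lookup", fit_lookup)]
results = {name: evaluate(fit, xs, ys) for name, fit in candidates}

# Prefer the model with the lowest penalized score, not the lowest residuals.
best = min(results, key=lambda name: results[name][2])

for name, (residual, n_params, score) in results.items():
    print(f"{name:8s} rss={residual:8.3f} params={n_params} score={score:8.3f}")
print("preferred:", best)
```

The lookup table is the "perfect model" here: its residuals are zero, but it needs one parameter per observation and says nothing about how y depends on x. Under the assumed penalty, the two-parameter line is preferred; a different penalty weight would shift where that balance falls.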
Recently, Stephen Wolfram released the results of a 20-year-long experiment in personal data collection, including every keystroke he’s typed and every email he’s sent. In response, Robert Krulwich, the other co-host of Radiolab, concludes by saying “I’m looking at your data [Dr. Wolfram], and you know what’s amazing to me? How much of you is missing.”
Personally, I disagree; I believe that there’s a humanity in those numbers and that Mr. Krulwich is falling prey to the idea that science somehow ruins the magic of the universe.